HAC-models: a novel approach to continuous speech recognition
نویسنده
چکیده
In this paper, a bottom-up, activation-based paradigm for continuous speech recognition is described. Speech is described by co-occurrence statistics of acoustic events over an analysis window of variable length, leading to a vectorial representation of high but fixed dimension called “Histogram of Acoustic Co-occurrence” (HAC). During training, recurring acoustic patterns are discovered and associated to words through non-negative matrix factorisation. During testing, word activations are computed from the HACrepresentation and their time of occurrence is estimated. Hence, words in a continuous utterance can be detected, ordered and located.
منابع مشابه
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملDetecting words in speech using linear separability in a bag-of-events vector space
This paper studies the properties of the Histograms of Acoustic Co-occurrences (HAC) approach to acoustic modeling. While HAC-vectors have been predominantly used with matrix decomposition algorithms, we show that the additivity and sparseness constraints inherent in HAC lead to a representational space in which utterances are linearly separable with respect to the words they contain. The metho...
متن کاملRecognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
متن کامل